Combining task-dependent information with auditory attention cues for prominence detection in speech
Authors
Abstract
Auditory attention is a highly complex mechanism that involves the processing of low-level acoustic features of sound together with higher-level cognitive rules. This paper proposes a novel method that combines biologically inspired auditory attention cues with higher-level lexical and syntactic information to model task-dependent influences on a given task. Feature maps are extracted from sound at multiple scales by mimicking the processing stages of the human auditory system and are converted to low-level auditory gist features. The auditory attention model then biases the gist features based on the task to maximize target detection. The top-down, task-dependent influence of lexical and syntactic information is incorporated into the model using a probabilistic approach. The combined model is tested on detecting prominent syllables in speech using the BU Radio News Corpus. It achieves 88% prominence detection accuracy at the syllable level, which is comparable to reported human performance on this task.
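The abstract does not spell out the probabilistic approach, so the following is only an illustrative sketch of one common way to fuse two information sources, not the authors' formulation. It assumes that the acoustic (gist-based) evidence and the lexical/syntactic evidence are conditionally independent given the class, so that P(c | A, L) is proportional to P(c | A) P(c | L) / P(c). The function name `combine_posteriors` and the example probabilities are hypothetical.

```python
def combine_posteriors(p_acoustic, p_lexical, prior):
    """Fuse two per-class posteriors into one.

    Assumption (not taken from the paper): acoustic and lexical evidence are
    conditionally independent given the class, so
        P(c | A, L)  is proportional to  P(c | A) * P(c | L) / P(c).
    Each argument is a list of probabilities, one entry per class,
    e.g. [P(prominent), P(non-prominent)].
    """
    if not (len(p_acoustic) == len(p_lexical) == len(prior)):
        raise ValueError("all inputs must have one entry per class")
    if any(p <= 0 for p in prior):
        raise ValueError("class priors must be strictly positive")

    # Unnormalized fused score for each class.
    scores = [pa * pl / pr for pa, pl, pr in zip(p_acoustic, p_lexical, prior)]
    total = sum(scores)
    if total == 0:
        raise ValueError("the two sources assign zero probability to every class")

    # Normalize so the fused scores sum to one.
    return [s / total for s in scores]


if __name__ == "__main__":
    # Hypothetical syllable: both sources lean towards "prominent".
    fused = combine_posteriors([0.7, 0.3], [0.6, 0.4], [0.5, 0.5])
    print(fused)
```

With a uniform prior, a lexical posterior that is itself uniform leaves the acoustic posterior unchanged, while agreement between the two sources sharpens the fused decision.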
Similar resources
Attentional Demand of Speech in Children and Adolescents with Developmental Stuttering
Background & Objective: Stuttering is a prevalent disorder in children and adolescents. Because attention is the only fuel resource for cognitive functions and language involves high-level cognitive functions, it is possible that speech difficulties are related to attention deficit. The purpose of this study was to investigate the attentional demand of speech in children and adolescents with dev...
Cognitive Processing of Audiovisual Cues to Prominence
This article addresses two related questions regarding the cognitive processing of audiovisual markers of prominence in spoken utterances: (1) how important are visual cues to prominence from the face with respect to verbal cues? and (2) are there differences between different facial areas in their cue value for prosodic prominence? The first perception experiment tackles the relation between a...
Eyebrow movement as a cue to prominence
Speech communication is inherently multimodal in nature. While the auditory modality often provides the phonetic information necessary to convey a linguistic message, the visual modality can qualify the auditory information, providing segmental cues on place of articulation, prosodic information concerning prominence and phrasing, and extralinguistic information such as signals for t...
Facial expression and prosodic prominence: Effects of modality and facial area
This article addresses two related questions regarding the perception of facial markers of prominence in spoken utterances: (1) how important are visual cues to prominence from the face with respect to auditory cues? and (2) are there differences between different facial areas in their cue value for prosodic prominence? The first perception experiment tackles the relation between auditory and v...
Psychoacoustics and speech perception in individuals with auditory neuropathy and in normal individuals
Background: The main result of hearing impairment is a reduction in speech perception. Patients with auditory neuropathy can hear but cannot understand. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...
Publication date: 2008